Molecular libraries

ABSTRACT

The present invention provides a method of screening a library of peptides of formula M-G/M/V-(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid, which method comprises:

a) transforming a host cell population with a library of nucleic acid constructs which can express free peptides of formula M-G/M/V-(X)_(n);

b) culturing the transformed host cells under conditions suitable for intra-cellular expression of the peptides of formula M-G/M/V-(X)_(n); and

c) analysing the host cell population to determine the effect of the peptides of formula M-G/M/V-(X)_(n) on a reporter system.

Also described are a library of nucleic acid constructs which can express free peptides in an intra-cellular environment, said peptides having the sequence M-G/M/V-(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid; and a method of generating a library of nucleic acid constructs which can express free peptides in an intra-cellular environment, which method comprises synthesising a library of DNA molecules which include the nucleotide sequence ATGGGA(NNK)_(n), wherein n is an integer from 3 to 18, N is A, C, T or G and K is G or T and wherein each NNK triplet may be the same or different, and inserting a library of synthesised DNA fragments which each includes a nucleotide sequence of formula ATGGGA(NNK)_(n) into expression vectors.

The present invention relates to molecular libraries, in particular to molecular libraries which can be screened to identify members of the library with certain characteristics.

The complex interactions and pathways between biomolecules in living organisms are the focus for much research, particularly as scientists strive to develop new candidate drugs for the diagnosis, prevention or treatment of disease. Of particular interest are proteins which have functionally active surfaces and pockets which are characterised by the three dimensional shape of the protein molecule and particular reactive amino acid side groups. These proteins are able to interact, associate or bind with a variety of other biomolecules, including other proteins, nucleic acids, organic molecules, ions etc.

There is a great desire to identify molecules which can undergo specific interactions with target molecules, e.g. proteins. Such molecules may be able to act as agonists or antagonists of enzyme or receptor activity, for example. More recently it has been proposed that the three dimensional shape of target proteins may be altered through interactions with small peptides (Nature Biotechnology letter, El-Gewely 1999, Vol 17, p 210). In this way the wild-type function of a mutant protein can be restored.

A variety of molecular libraries which can be screened to identify molecules with desired properties are known in the art. These may comprise organic chemicals synthesised by traditional methods or chemicals produced by combinatorial chemistry. It is expensive to generate very large libraries of such organic chemicals even using combinatorial techniques and so interest has turned to biological systems.

The best known of these is phage display, where peptides are expressed on the surface of a filamentous phage as a fusion protein with a coat protein. Ideally each phage will carry only one such peptide (although it will generally have several copies of that molecule) on its surface and these are then screened in vitro for their ability to bind to a target ligand. The phage provide a useful link between genotype and phenotype, as the phage which has a useful ligand on its surface will also have the nucleic acid inside it which codes for that peptide ligand. The phage can be cultured, the nucleic acid amplified and then sequenced to determine the identity of the ligand exhibiting desirable properties. However, only ligands in the form of fusion proteins with coat proteins can be investigated according to this method and it is only possible to perform a rather limited in vitro binding assay. Much more useful information could be generated if it were possible to analyse the performance of the members of the molecular library in vivo, in an environment similar to that which would be found inside the cells where the molecule is intended to act.

Another biologically based system which provides a molecular library is known as Aptamers/SELEX (Ellington and Conrad 1995, Gold et al. 1997 etc.). SELEX stands for Systematic Evolution of Ligands by Exponential Enrichment. According to this technique, single or double stranded nucleic acid molecules that bind to other molecules are investigated, e.g. in order to find a small molecule (in this case a nucleic acid molecule) that binds to a target protein. The short nucleic acid molecules with strong and specific binding capabilities are the “aptamers”. Again, these systems only allow in vitro screening of molecules.

Thus, there remains a need for a relatively cheap molecular library whose members can be analysed in vivo without the need to generate fusion proteins which as a whole may not reflect the potential binding activity of the individual library molecules themselves. These problems have been addressed by the present invention, which according to one aspect provides a library of nucleic acid constructs which can express free peptides in an intra-cellular environment, said peptides having the sequence M-G/M/V-(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid.

The above defined free peptides are optionally expressed as a fusion peptide with a cleavable signal sequence responsible for localization or translocation of the peptide. Such signal sequences are well known in the art and will typically be no more than 30, preferably no more than 20 amino acids in length. The term ‘free’ will be understood with this in mind and therefore constructs whose initial translation product incorporates a peptide of formula M-G/M/V-(X)_(n) together with a cleavable signal sequence are still considered ‘constructs which can express free peptides’ of sequence M-G/M/V-(X)_(n). The free peptides encoded by the libraries of the present invention can be contrasted with libraries of proteins which comprise peptides of interest fused to a large carrier protein, e.g. designed for surface display of the peptides of interest.

The integer represented by the letter n in the above formula is from 3 to 18, preferably from 3 to 10 and typically from 3 to 5.

Thus in each of the expressed peptides the first residue is methionine, the second residue is glycine, methionine or valine and the third and subsequent amino acids may be any of the genetically coded amino acids. The universally accepted single letter code system for amino acids is used throughout.

It is well known that there are 20 genetically coded amino acids and any of these may be present in the encoded peptides. There will be between 3 and 18 X residues which may be the same or different and it is envisaged that for a given library, once the desired length of the peptides has been set, representatives of most if not all available combinations of residues will be present. Thus for example, using the single letter code for amino acids, where the desired peptides are pentamers, the library would encode most or all of the following peptides, which are themselves only listed by way of example: MMACD (SEQ ID NO.: 1), MMACE (SEQ ID NO.: 2), MMACF (SEQ ID NO.: 3), MMACG (SEQ ID NO.: 4), MMADE (SEQ ID NO.: 5), MMADF (SEQ ID NO.: 6), MMADG (SEQ ID NO.: 7), MGACD (SEQ ID NO.: 8), MGACE (SEQ ID NO.: 9), MGACF (SEQ ID NO.: 10), MGACG (SEQ ID NO.: 11)

etc. Preferably the second amino acid is glycine.
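The size of such a peptide repertoire follows directly from the formula: three choices for the second residue and 20 choices for each X. The following Python sketch is illustrative only and is not part of the described method; the function names are hypothetical. It enumerates the pentamers (n = 3) and counts them.

```python
from itertools import product

# The 20 genetically encoded amino acids, single letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Permitted second residues in M-G/M/V-(X)_(n).
SECOND_RESIDUES = "GMV"


def library_size(n, second=SECOND_RESIDUES):
    """Number of distinct peptides M-<second>-(X)_n for a given n."""
    return len(second) * len(AMINO_ACIDS) ** n


def enumerate_peptides(n, second=SECOND_RESIDUES):
    """Yield every peptide of formula M-G/M/V-(X)_n (hypothetical helper)."""
    for s in second:
        for xs in product(AMINO_ACIDS, repeat=n):
            yield "M" + s + "".join(xs)


# Pentamers correspond to n = 3 (M, second residue, three X residues).
pentamers = list(enumerate_peptides(3))
```

Under these assumptions a pentamer library contains 3 x 20^3 = 24,000 peptides, or 8,000 when the second residue is fixed as glycine.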

A library of constructs according to the invention may encode peptides of different lengths but typically a library will encode only peptides of 1-3 different lengths, often of only one length. Thus a library will typically have constructs which encode just pentamers or just hexamers etc. The nucleic acids encoding these peptides are referred to herein as microgenes, or as forming a microgenex library.

The present invention provides a library of constructs which in turn is capable of generating a library of peptides which can be screened for their activity in a given intra-cellular environment. Thus, according to the present invention, the library of molecules which is actually investigated/screened is made up of peptides, the library as a whole providing a repertoire of different shapes. The “shapes” will also have regions of positive and negative charge and various reactive groups which will impact on their in vivo activity. The peptides are putative ligands for one or more target proteins and are conveniently referred to herein as a ‘ligand repertoire’ when either in peptide or corresponding nucleic acid form.

In a further aspect the present invention provides a (synthetic) library of peptides, each member of said library having the amino acid sequence M-G/M/V-(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid. The library is synthetic in that it has been generated through the use of recombinant DNA technology and is not naturally occurring.

The libraries of the present invention have at least 50, typically at least 100, preferably at least 500 and more preferably at least 2000 members. The members of the library are usually screened at the same or substantially the same time, e.g. the library members are in contact with the host cells at the same time or at least screened as part of a series of coordinated multiplex assays. The library may be screened all together or in batches but in that case at least 8, preferably at least 64 or 96 different library members would still be contacted with the host cell population at the same time. The different nucleic acid constructs are recognisably part of a library as they will generally differ only in the region which encodes the peptides of formula M-G/M/V-(X)_(n) defined above; thus the constructs, which are typically plasmid vectors, will contain the same flanking sequences, i.e. promoter etc., on either side of the sequence encoding the peptide of interest.

The term “nucleic acid construct” refers to a nucleic acid molecule which contains a coding region for the peptides as defined herein operably linked to a promoter region. The constructs are typically in the form of expression vectors, e.g. plasmid vectors, and other expression vectors such as phagemid or lambda expression vectors can also be used. The constructs are designed to allow expression in prokaryotic or eukaryotic host cells and thus a mammalian, bacterial or yeast etc. expression vector is selected as appropriate. Suitable expression vectors for use with different cell types are well known in the art. The library of peptide molecules encoded by the library of constructs can be analysed for their ability to interact with a given target system or protein. If the target protein is mammalian it follows that the library will ideally be expressed in mammalian cells and the expression vector will be selected accordingly.

According to a further aspect, the present invention also provides a method of generating a library of nucleic acid constructs which can express free peptides in an intra-cellular environment, which method comprises synthesising a library of DNA molecules which include the nucleotide sequence ATGGGA(NNK)_(n), wherein n is an integer from 3 to 18, N is a nucleotide selected from A, C, T or G and K is a nucleotide selected from G or T and wherein each NNK triplet may be the same or different, and inserting a library of synthesised DNA fragments which each includes a nucleotide sequence of formula ATGGGA(NNK)_(n) into expression vectors.
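The degenerate formula ATGGGA(NNK)_(n) can be written out and expanded mechanically. The Python sketch below is offered only as an illustration; the helper names are hypothetical and the flanking vector sequences are not modelled. It builds the template for a given n and counts the distinct DNA sequences it covers (4 x 4 x 2 = 32 per NNK triplet).

```python
from itertools import product

# Degenerate symbols used in the formula: N = A, C, G or T; K = G or T.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def degenerate_template(n):
    """Return the template ATGGGA(NNK)_n for n from 3 to 18."""
    if not 3 <= n <= 18:
        raise ValueError("n must be an integer from 3 to 18")
    return "ATGGGA" + "NNK" * n


def dna_diversity(n):
    """Number of distinct DNA sequences covered by ATGGGA(NNK)_n."""
    return 32 ** n


def expand(template):
    """Yield every concrete sequence matching a degenerate template."""
    for bases in product(*(DEGENERATE[symbol] for symbol in template)):
        yield "".join(bases)
```

For n = 3 the template is ATGGGANNKNNKNNK, which covers 32^3 = 32,768 DNA sequences; full expansion quickly becomes impractical for larger n, so `dna_diversity` is the more useful function there.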

Thus the invention provides libraries for use in any host cell for which a suitable promoter to drive expression is known. Modifications to nucleic acid libraries disclosed herein which would be necessary for expression in a different host cell system are within the competence of the skilled man, so a range of bacterial, eukaryotic or yeast compatible libraries may readily be developed. The Shine-Dalgarno sequence which is required for expression in bacteria does not hinder expression in eukaryotic cells.

It should be understood that the construct library is capable of generating a repertoire of different molecules with different three dimensional configurations and different functional groups. As with a chemical library, this same library of peptides can be used in many different screening methods, whether it be for allosteric inhibition of an enzyme, the ability to competitively inhibit the activity of an intra-cellular messenger at a receptor binding site, to control nucleic acid transcription or translation, or to restore the function of a mutant protein through interaction therewith which alters the mutant protein's three dimensional shape.

Whereas a phage display library is used in an in vitro binding assay, the libraries of the present invention, because the peptides are expressed as free peptides inside the cell, may be used in an in vivo screening method to investigate the activities of the peptide ligands in normal cell conditions. Thus toxic molecules may be excluded at an early stage in testing.

The cells in which the library of peptide molecules is to be expressed can internalise the expression vectors into which the ligand repertoire has been constructed and there is thus no need either to synthesize peptides or to purify peptide libraries. Preferably, the expression vectors are capable of autonomous replication. This prevents the dilution of transfected ligand-encoding plasmids during cell division and avoids integration of the (e.g. plasmid) vector into the host cell chromosome, which causes difficulties in sequence analysis of the region of the nucleic acid within a cell which encodes the peptide ligand of interest.

Recombinant technology to generate expression vectors is well known and a suitable method for generating the library of constructs incorporating the desired sequence variation is described in the Examples. The variation in the nucleic acid sequence which translates into the different peptide ligands is typically generated through the synthesis of a master primer which has regions of known sequence flanking the region which encodes the peptide of formula M-G/M/V-(X)_(n) defined above (see FIG. 2). As the first amino acid is always methionine, the first three nucleotides will be ATG. In the preferred case where the second residue is glycine, the next three nucleotides will be GGA, and the remaining triplets follow the formula NNK, wherein N is A, C, T or G and K is G or T. This can allow for any nucleotide in any of the Y positions.

The primer is generated by adding particular nucleotides in turn until the NNK regions are reached, when all 4 nucleotides are added in equal molar amounts for the addition of 2 nucleotides to the growing strand and then G and T are added in equal molar amounts for the third position in the triplet. This cycle of additions is repeated until nucleic acid encoding the peptides of the required length is generated. There are many commercially available machines or services for synthesis of oligonucleotides, e.g. as provided by Sigma and Life Technologies.
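The addition cycle just described can be mimicked in software by drawing each degenerate position at random with equal probability. The following Python sketch is a simplified model only, under the assumption that equimolar addition gives a uniform choice at each position; it models the coding region alone, not the flanking primer sequences, and the function name is hypothetical.

```python
import random


def synthesise_coding_region(n, rng):
    """Simulate one molecule of ATGGGA(NNK)_n from an equimolar synthesis.

    rng is a random.Random instance; passing a seeded one makes the
    simulation reproducible.
    """
    parts = ["ATG", "GGA"]  # fixed codons for Met and the preferred Gly
    for _ in range(n):
        first = rng.choice("ACGT")   # N: all four nucleotides, equal amounts
        second = rng.choice("ACGT")  # N: all four nucleotides, equal amounts
        third = rng.choice("GT")     # K: G or T only, equal amounts
        parts.append(first + second + third)
    return "".join(parts)


# Example: a small simulated pool of molecules encoding pentamers (n = 3).
rng = random.Random(0)
pool = [synthesise_coding_region(3, rng) for _ in range(10)]
```

Each simulated molecule stands for one library member; real syntheses deviate from uniformity, which this sketch does not attempt to capture.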

This technique provides a simple method for generating a vast range of potentially active biomolecules. The NNK motif of nucleotide addition means that all of the standard amino acids can be incorporated into the peptides in any combination, so the peptides can include the full range of reactive side groups and overall variation in size, shape, polarity, hydrophobicity etc. which are provided by the amino acids individually and by combinations of amino acids in juxtaposition.
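That the NNK motif reaches every standard amino acid can be checked against the standard genetic code. The Python sketch below is illustrative only; it builds the standard codon table, lists the 32 NNK codons and collects what they encode. It also shows that one stop codon, TAG, is included in the NNK set.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# All codons matching NNK: any base, any base, then G or T.
NNK_CODONS = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]

# Amino acids reachable through NNK, and any stop codons in the set.
nnk_amino_acids = {CODON_TABLE[c] for c in NNK_CODONS} - {"*"}
nnk_stop_codons = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]
```

Restricting the third position to G or T therefore keeps all 20 amino acids while excluding two of the three stop codons (TAA and TGA).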

In a further aspect, the present invention provides a method of identifying an active peptide ligand of formula M-G/M/V-(X)_(n) as defined above which comprises:

a) transforming a host cell population with a library of nucleic acid constructs as defined herein;

b) culturing the transformed host cells under conditions suitable for intra-cellular expression of the free peptides of formula M-G/M/V-(X)_(n);

c) analysing the host cell population to determine the effect of the expressed free peptides on a reporter system; and optionally

d) identifying the peptide of formula M-G/M/V-(X)_(n) in those cells in which the reporter system indicates a positive response.

The present invention further provides a method of identifying a peptide ligand of formula M-G/M/V-(X)_(n) as defined above having a desired activity on a target protein, which method comprises:

a) transforming a host cell population with a library of nucleic acid constructs which can express peptides of formula M-G/M/V-(X)_(n);

b) culturing the transformed host cells under conditions suitable for intra-cellular expression of the peptides of formula M-G/M/V-(X)_(n);

c) analysing the host cell population to determine the effect of the expressed peptides on said target protein, which target protein forms part of a reporter system; and optionally

d) identifying the peptide of formula M-G/M/V-(X)_(n) in those cells in which the reporter system indicates a positive response. A positive response in said reporter system indicates that one or more peptides in that cell has the aforementioned desired activity. The desired activity will typically be an ability to restore wild-type activity in the target protein, the host cells having a non-wild-type mutant form of the target protein.

Alternatively viewed, the present invention provides a method of screening a library of peptides of formula M-G/M/V-(X)_(n) wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid, which method comprises:

a) transforming a host cell population with a library of nucleic acid constructs which can express free peptides of formula M-G/M/V-(X)_(n);

b) culturing the transformed host cells under conditions suitable for intra-cellular expression of the peptides of formula M-G/M/V-(X)_(n); and

c) analysing the host cell population to determine the effect of the peptides of formula M-G/M/V-(X)_(n) on a reporter system. The reporter system is fully or partially an intra-cellular system.

These same method steps could also be considered to define a method of screening a library of nucleic acid constructs. The ‘microgenes’ encoding the library of peptides may be expressed constitutively or expression can be induced from a promoter which is operably linked to said microgene. The constructs may be autonomously replicated or integrated in the host chromosome.

We describe methods of “screening a library”. While the library may have been selected or designed to have certain structural motifs and the nature of individual members of that library can be assumed, this phrase implies that it is not known which member is introduced into which cell (or cell sample) and the molecules are not introduced in a controlled series.

Suitable transformation and culturing techniques are well known in the art and some are discussed in detail in the Examples hereto. Suitable reporter systems are discussed in detail below and will depend on the desired activity of the ligands. The reporter system will conveniently involve up or down regulation of gene expression, either at the level of transcription or translation. The gene whose level of expression is monitored may be native to the host cell but is typically an exogenous reporter gene. The reporter system will incorporate a target protein whose activity can be altered through interaction with one or more members of the library of peptides defined and described above. Binding or association of the peptide ligand to or with the target protein will, in some cases, result in a conformational change in the target protein and where this results in a change in activity of the target protein this may be observed through the reporter system. Suitable target proteins are discussed in more detail below and include enzymes and transcription factors. The reporter system may be directly or indirectly related to the desired ligand function. For example, if the ability of a ligand to activate transcription or translation of a gene conferring antibiotic resistance is being investigated, culturing the host cells in the presence of that antibiotic would readily identify successful ligands, as cells which expressed non-successful ligands would not grow. Given the size of the peptides, activation is unlikely to be through direct nucleic acid binding but rather through interaction with the protein machinery which performs and/or controls transcription/translation.

Thus analysis of the host cell population may simply be by observation, e.g. of growing plaques or through detecting a particular colour, for example if the β-gal reporter system is used.

In step d) the active peptide is typically identified by analysis of the nucleic acid encoding it. The Examples provide a discussion of how the nucleic acid of the constructs from selected host cells may be sequenced. For a given host cell, the link between phenotype of the peptide ligand in terms of its impact on the reporter system and genotype of the microgene encoding said ligand allows convenient identification of ligands exhibiting the desired activity.
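Once the microgene from a selected host cell has been sequenced, reading off the encoded peptide is a straightforward translation with the standard genetic code. The Python sketch below illustrates that step only; the function name is hypothetical and the input is assumed to be the coding region alone, starting at the ATG, with flanking vector sequence already trimmed.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def decode_microgene(dna):
    """Translate a sequenced microgene coding region into its peptide.

    Translation starts at the first base and stops at the first stop
    codon or at the end of the sequence, whichever comes first.
    """
    dna = dna.upper()
    if not dna.startswith("ATG"):
        raise ValueError("coding region is expected to start with ATG")
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the peptide
            break
        peptide.append(residue)
    return "".join(peptide)
```

For instance, the coding sequence ATGGGAGCTTGTGAT decodes to MGACD, one of the pentamers listed by way of example above.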

A library of molecular shapes, i.e. ligands, for in vivo screening has a wide variety of uses. Of particular interest is the potential for members of a library to encode peptides which can take part in protein therapy through protein function restoration. It has recently been shown that while a mutant protein may have an altered three-dimensional structure due to a non-wild type primary structure, the three-dimensional structure can be corrected without changing the primary structure. Small molecules, e.g. peptides, may be used which interact with the mutant protein to modify its three dimensional structure and thus its activity. Proteins are known to change their conformation on binding or associating with a range of different classes of molecules, e.g. other proteins, nucleic acids, prosthetic groups such as heme, ions such as Ca²⁺ etc.

Peptide ligands identified according to the above described method can be used therapeutically to alter the three-dimensional structure of a mutant protein and thus restore or alter the function or activity of that mutant protein. Typically the ligand can restore, in whole or in part, wild type protein function. Peptide ligands identified according to the above screening method may be chemically modified before being used in therapy, in particular to enhance in vivo stability or by introduction of targeting or signal moieties such as the HIV Tat tag or folate, the receptor for which is over-expressed in many cancer cells. There are standard techniques which can be used to achieve this, e.g. C terminal modifications such as amidation, N terminal modifications, or the use of D- instead of L-amino acids.

It has been found that very small peptides can restore or modify the function of target proteins through interacting with them in vivo and causing changes in their conformation. The free peptides expressed by the library of nucleic acid constructs will preferably have only 3-8 amino acids, particularly preferably 3-6 amino acids. If a signal sequence is used, these preferred lengths refer to the peptides generated after cleavage of the signal sequence.

In a further aspect, the present invention therefore provides for the use of a library of nucleic acid constructs as defined herein in a method of screening a library of molecules for the ability of members of that library to restore or modify the function of a target protein in an intra-cellular environment, which method comprises introducing the library into cells which have a reporter system which allows the identification of those cells in which the function of the target protein has been restored or modified.

Thus the reporter system is compatible with, or more particularly includes, the target protein. The reporter system preferably comprises a reporter gene which is operably linked to a sequence of nucleotides which provides a binding site for the target protein or for a protein which associates with or is a substrate for said target protein.

The screening methods of the invention give information about individual members of the library of molecules. The method is performed in an intra-cellular environment (i.e. in vivo), and therefore all toxic compounds will be weeded out very early on. This reduces the cost of screening and of pre-clinical and even phase I clinical trials. The method is able to indicate in a qualitative, and possibly also quantitative, manner the performance of individual molecules in the test system. A target protein and suitable host cells are selected and the method designed so that the reporter system is able to give information about the ability of each molecule to restore or modify the function of the target protein.

The reporter system is typically designed to measure restoration of a particular function of the target protein, for example the ability to act as a DNA binding transcription factor. Thus restoration of function refers to a return to wild-type function of the protein in at least one physiologically relevant and measurable respect.

In many cases, the wild-type functions of the target protein in all respects may be restored but this is not essential for the successful working of the invention. It will be appreciated that partial restoration of the function of a protein may still be useful and it is not a requirement for a positive identification according to the claimed screening method to provide 100% of wild-type activity.

The sensitivity of the reporter system may conveniently be modified to provide a more or less stringent assay and thus to identify as giving “positive” results only those molecules which have achieved a significant increase in protein function. Preferably, molecules giving positive results according to the claimed screening methods will have brought about at least a 30%, more preferably at least a 50%, particularly preferably at least a 70% restoration in wild-type protein function, with respect to the particular function monitored by the intra-cellular reporter system. In some instances, a ligand may correct the mutant to a level of measured activity which is actually greater than wild type. It will be clear to the skilled reader that the art provides a number of ways of comparing the relative activities of different molecules. Some reporter systems, e.g. those based on fluorescence, may be quantitative and thus facilitate such a comparison between wild-type and corrected mutant. However an assessment relative to the wild-type protein may more conveniently be performed as part of a separate test utilising serial dilutions, measurement of plaque size etc.
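Where a quantitative reporter readout is available, the percentage thresholds above can be applied numerically. The text does not prescribe a formula, so the Python sketch below makes an explicit assumption: restoration is taken as the fraction of the gap between the untreated mutant signal and the wild-type signal that the peptide closes. The function names and tier labels are hypothetical.

```python
def percent_restoration(treated, mutant, wild_type):
    """Percentage of the mutant-to-wild-type gap closed by the peptide.

    ASSUMPTION (not from the specification): restoration is computed as
    100 * (treated - mutant) / (wild_type - mutant), with all three values
    measured by the same reporter under the same conditions.
    """
    if wild_type == mutant:
        raise ValueError("wild-type and mutant signals must differ")
    return 100.0 * (treated - mutant) / (wild_type - mutant)


def preference_tier(restoration):
    """Map a restoration percentage onto the 30/50/70% thresholds."""
    if restoration >= 70:
        return "at least 70% (particularly preferred)"
    if restoration >= 50:
        return "at least 50% (more preferred)"
    if restoration >= 30:
        return "at least 30% (preferred)"
    return "below 30%"
```

With illustrative readings of 100 for wild type and 10 for the untreated mutant, a treated reading of 55 corresponds to 50% restoration; values above 100% are possible where the corrected mutant exceeds wild-type activity.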

Suitable target proteins for restoration of function will include those whose presence in mutated form is associated with a disease state. The mutated version of the protein will typically have an altered 3-dimensional structure which affects its ability to interact with other molecules (ions, intra-cellular organelles and other cell components etc.). The members of the molecular library will be screened for their ability to interact with the mutant target and thus alter its 3-dimensional structure. The 3-dimensional structure in vivo may be closer in one or more respects to the wild-type 3-dimensional structure as a result of successful interaction with a member of the library, but protein function may also be restored through a further compensating change in 3-dimensional structure which has a return to wild-type function as a ‘net’ result. Preferred target proteins will be those wherein point mutation(s) are the cause of disease. Particularly preferred are transcription factors (DNA binding proteins) such as p53, but others would be enzymes, peptide hormones or receptor molecules.

All diseases caused by monogenic Mendelian mutations could be targets for treatment with molecules identified according to the screening methods of the present invention. These include genetic (i.e. hereditary) diseases, cancer and symptoms of aging caused by mutations. The host cells should provide a suitable model for determining whether the members of the library are capable of causing the desired change in target protein function with a view, typically, to therapeutic administration of successful members of the library or a derivative thereof.

Thus, where protein function restoration is required, the host cells will typically be derived from cancerous or other diseased cells and naturally produce the dysfunctional target protein. Suitable cell lines are available, for example in cell culture collections such as the ATCC, and may be derived from osteosarcoma, adenocarcinoma etc., or indeed any established cell line that expresses the mutant protein. Cell lines that lack any expression of the protein of interest can also be used after introducing and expressing the gene encoding the mutated protein.

Standard transfection techniques may be used to introduce the nucleic acid constructs into the host cells in which expression of the free peptide ligands is to take place. The terms ‘transformation’ and ‘transfection’ are used interchangeably herein.

The appropriate reporter system for a given screening method will depend upon the nature of the target protein under investigation. The reporter system is chosen to be responsive to the target protein and thus be capable of indicating the functional status of the target protein. The mammalian p53 protein illustrates the principle behind a suitable reporter system and is a particularly preferred target protein. P53 functions in mammalian cells mainly through the ability to act as a transcription factor and is a transcriptional activator of several genes associated with cell cycle regulation such as p21, Bax, CD95 (Fas/Apo-1) and 14-3-3σ (HME1). It can also down regulate the transcription of cell-cycle-regulating genes such as Bcl2, Cdc2 and cyclin. P53 acts as a checkpoint, monitoring DNA damage and regulating cell cycle progression. Loss of p53 activity predisposes cells to the acquisition of oncogenic mutations and may favour genetic instability. Approximately 90% of mutations reported in the p53 mutation database are found in the DNA binding domain. Mutations in p53 can lead to cancer formation where p53-mediated apoptosis is deficient due to the mutation.

P21 is one of the p53-transactivated genes that are critical in cell cycle control and many p53 mutants found in cancer cells lose the ability to transactivate p21 transcription. Thus the transactivation level of the p21 promoter in a cell reflects whether the p53 protein expressed in that cell has wild type function. As discussed in more detail in the Examples, a p53 reporter system was constructed by cloning the human p21 promoter upstream of the puromycin resistance gene. The reporter construct can provide human cells with puromycin resistance only in the presence of wild type or wild-type functioning p53 that transactivates the p21 promoter. Such a reporter system offers significant practical benefits, as a life-death selection system is possible where only cells with wild-type-functioning p53 can survive when grown in the presence of puromycin.
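The life-death selection principle can be summarised as a simple truth table. The Python sketch below is a deliberately idealised model, not a description of the experimental system: it assumes an all-or-nothing response with no background resistance or leaky promoter activity, and all names in it are hypothetical.

```python
def survives(puromycin_present, has_reporter_construct, p53_transactivates_p21):
    """Idealised outcome for one cell under the p21-puromycin reporter.

    ASSUMPTION: resistance is all-or-nothing and arises only when the
    reporter construct is present and p53 activates the p21 promoter.
    """
    if not puromycin_present:
        return True  # no selection pressure, every cell grows
    return has_reporter_construct and p53_transactivates_p21


def select_survivors(cells, puromycin_present=True):
    """Return the identifiers of cells that survive selection.

    cells maps a (hypothetical) cell identifier to a tuple of
    (has_reporter_construct, p53_transactivates_p21).
    """
    return [
        cell_id
        for cell_id, (has_reporter, p53_ok) in cells.items()
        if survives(puromycin_present, has_reporter, p53_ok)
    ]
```

In this model the only survivors in the presence of puromycin are cells that carry the reporter construct and in which p53 function has been restored by the expressed peptide, which is what makes the surviving population worth sequencing.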

Thus, while the host cells may have a reporter system as part of their normal genome which can be utilised, typically the host cells will have been modified to include a suitable gene based reporter system. The genetic constructs which comprise all or part of the reporter system will thus have been introduced into the host cells, e.g. by standard transformation/transfection techniques, before the screening method is performed. In certain circumstances, the reporter constructs could be introduced at the same time as the molecular library or even, but not preferably, after introduction of the library. Thus for performance of the screening methods of the invention, host cells will preferably have been co-transfected (although not necessarily simultaneously) with a reporter construct and a construct which encodes a peptide whose ability to restore or modify the function of a target protein is to be investigated. Alternatively, members of a chemical compound library may be contacted with the host cells which have been transfected with reporter constructs.

Target proteins are preferably DNA binding proteins which up or down regulate the expression of other genes through binding to promoter or enhancer regions. Their native gene targets may be utilised to report on whether a target protein function has been restored or modified, but preferably their target DNA binding regions will be operably linked to reporter genes such as genes conferring antibiotic resistance, or which encode fluorescent proteins such as GFP (green fluorescent protein), or the much used bacterial reporter enzymes β-galactosidase (β-Gal) or β-glucuronidase (β-Gus) etc. According to a particularly convenient and preferred method the protein products of these reporter genes (e.g. β-Gal or β-Gus) could be fused to a signal peptide in order to ‘display’ the proteins on the cell surface. In this way, the cells expressing the reporter protein can readily be separated physically from other cells that do not express the reporter protein, e.g. by the use of antibodies. Suitable signal peptides are described in the literature and include the signals from a protein such as PDGFR (platelet-derived growth factor receptor). A typical construct would thus be as follows:

Secretion signal peptide - reporter protein (e.g. β-Gal/β-Gus) - transmembrane domain; e.g. METDTLLLWVLLLWVPGSTGD (SEQ. ID. NO.: 12) - β-Gal/β-Gus - AVGQDTQEVIVVPHSLPFKVVVISAILALVVLTIISLIILIMLWQKKPR (SEQ. ID. NO.: 13) - stop.

Non-DNA binding proteins can also be used as target proteins provided their wild type activities can be monitored and assayed for or the reporter system engineered such that cell survival depends on restoration of the protein function.

Identification of cells in which the function of the target protein has been restored or modified (e.g. wild-type function has been restored) will depend on the reporter system used. As described above, this is particularly conveniently performed when cells in which wild-type function has not been restored do not survive in the selected culturing environment. Fluorescence detection can also be used to identify cells in which the function of the target protein has been restored or modified. The target protein may, for example, be an enzyme rather than a nucleic acid binding protein and the ability of the target enzyme to act on a substrate may be used as the intra-cellular reporter system allowing identification of cells in which the function of the target protein has been restored or modified. Other suitable reporter systems such as the β-gal blue/white system are well known in the art and can be used or adapted by the skilled man in his chosen screening method.

A library of nucleic acid constructs according to the present invention is conveniently referred to herein as a “Microgenex” library and the stretches of nucleic acid which encode the peptide ligands of interest as “Microgenes”.

A further discussion relevant to the context and appreciation of the present invention is found in our co-pending International application filed 23 Jan. 2003 and claiming first priority from GB 0201522.0.

The invention will now be described in more detail in the following non-limiting Examples and with reference to the figures in which:

FIG. 1 shows in schematic form the general strategy for construction of a library of nucleic acid constructs according to the present invention.

FIG. 2 illustrates the strategy of primer design for construction of a Microgenex library. Restriction enzyme sites within primers as well as the encoded amino acids are shown below the primer sequences. M=methionine, G=glycine and X=any of the 20 genetically coded amino acids.

FIG. 3 illustrates the construction of a peptide library for expression in mammalian cells.

FIG. 4 illustrates the construction of a peptide library for expression in bacteria (E. coli).

FIG. 5 gives a map of the constructed Microgenex library of the Examples that can be used for expression in mammalian cells.

FIG. 6 gives a map of the constructed Microgenex library of the Examples that can be used for expression in E. coli. “Cm-R” refers to the chloramphenicol resistance gene. “p15A” is a replication origin compatible with several other replication origins used in molecular biology and biotechnology. The P_(BAD) promoter is suppressed by araC but can be induced by L-arabinose. The peptide library coding sequence was inserted under the P_(BAD) promoter between the KpnI and HindIII sites. The SD sequence is a ribosome-binding site to help in the translation of the peptides in E. coli. The first two amino acids at the N-terminus (“M” for Met, and “G” for Gly) can stabilize the peptides inside cells. The two stop codons assure the termination of translation while the terminator (“term”) sequence will terminate the transcription driven by the P_(BAD) promoter. The DNA fragment of the peptide library was generated by PCR, using the Master, Right and Left primers as described in the Examples.

FIG. 7 provides a schematic representation of the generated peptide library plasmid. Each peptide in the library is composed of three random amino acids after the two N-terminal amino acids Methionine (Met, or M) and Glycine (Gly, or G). Each of the next three positions, X, stands for any amino acid, so that the library consists of randomly distributed three-amino-acid peptides. The peptide-coding DNA fragment was inserted into the mammalian expression vector pCEP4 between its KpnI and HindIII sites under the CMV promoter. The plasmid has OriP, which enables the replication of the plasmid inside mammalian cells. The selection marker expression cassette in the plasmid provides the plasmid-transfected cells with hygromycin resistance.

EXAMPLES

Example 1

Strategy for Microgenex (Nucleic Acid Library) Construction

Materials and Methods:

1—Primers.

All primers used to construct Microgenex libraries are listed in Table 1.

TABLE 1 Primers used to construct Microgenex libraries.

No | Name | Sequence | Bases
1 | Master | AAGAGCTCGGTACCAAGAAGGAGTTTACATATGGGANNKNNKNNKTGATAAGGATCCAAGCTTGAATTCAG (SEQ. ID. NO.: 14) | 71
2 | Left | AAGAGCTCGGTACCAAGAAGGAG (SEQ. ID. NO.: 15) | 23
3 | Right | CTGAATTCAAGCTTGGATCCTTATC (SEQ. ID. NO.: 16) | 25

Primers used for DNA sequencing are listed in Table 2.

TABLE 2 The sequencing primers used for the Microgenex verifications.

No | Name | Sequence | Bases
1 | pCEP-f | AGAGCTCGTTTAGTGAACCG (SEQ. ID. NO.: 17) | 20
2 | pCEP-r | GTGGTTTGTCCAAACTCATC (SEQ. ID. NO.: 18) | 20
3 | Left* | AAGAGCTCGGTACCAAGAAGGAG (SEQ. ID. NO.: 19) | 23
4 | Right* | CTGAATTCAAGCTTGGATCCTTATC (SEQ. ID. NO.: 20) | 25

*These primers can also be used to verify the sequence of any Microgenex library. Other primers within the cloning vector used can also be used.

2—Cloning Vectors/Plasmids

-   pCEP4 (Invitrogen): for mammalian expression.
-   pBAD33 (Dr. Beckwith, Harvard; Guzman et al., 1995 J. Bacteriol. 177(14) 4121-4130): for Escherichia coli expression.

3—Escherichia coli Strain Used for Transforming Constructed Microgenex Libraries:

Electrocompetent DH10B™ (Life Technologies, GibcoBRL) was used for transformation with the ligated Microgenex expression libraries.

4—PCR Conditions for Microgenex Amplifications

4.1. Concentrations of Primers:

-   Master primer: 25 PM
-   Right primer: 100 PM
-   Left primer: 100 PM

4.2. PCR Reactions:

At least two different polymerases gave good results: Taq polymerase (Life Technologies, GibcoBRL) and rTth DNA polymerase (Perkin Elmer).

-   4.2.1. Standard Taq polymerase, as for example in SuperMix (Life Technologies, GibcoBRL). Prepare 1 to 10 PCR tubes with the following:
    -   90 μl SuperMix
    -   2 μl Master Primer (25 PM)
    -   4 μl Right Primer (100 PM)
    -   1 Left Primer (100 PM)
-   4.2.2. XL PCR, extra long (rTth DNA polymerase) (Perkin Elmer). This polymerase also worked very well after adding buffers according to the manufacturer, using the primers as above and the PCR cycles as indicated below.
-   4.2.3. PCR Cycles

Best conditions were found by making a hot start at 95° C. for 3-5 minutes. PCR was performed using only 25 cycles. Example of a PCR cycle:

Step | Temperature | Time
Hot start | 95° C. | 3 min.
Denaturation (25 cycles) | 94° C. | 1 min.
Annealing (25 cycles) | 38° C. | 1 min.
Elongation (25 cycles) | 60° C. | 1 min.
Hold | 4° C. |

5. Ligation of PCR Products to the Corresponding Expression Vector:

PCR product was first treated with phenol-chloroform, then with chloroform, and then precipitated with 3 volumes of cold ethanol. The DNA pellet was washed with cold 70% ethanol, dried and dissolved in an appropriate volume of H₂O. DNA was then restricted with KpnI and HindIII. Restricted DNA was gel-purified and subsequently ligated to the corresponding expression vector that had also been restricted with KpnI and HindIII. The ligation mix was precipitated, resuspended in H₂O and subsequently used to transform, by electroporation, the Escherichia coli strain DH10B™ (Life Technologies, GibcoBRL). Plating was done on LB ampicillin plates (200 μg/ml) for pCEP4, or chloramphenicol (40 μg/ml) for pBAD33.

6. Pooling of Colonies:

Colonies from all the plates were pooled together and washed with LB with ampicillin (200 μg/ml) if the cloning vector contains the beta-lactamase gene, as in the case of pCEP4. This step is important to get rid of cells that lack the plasmid. The pooled cell mixture was divided into small aliquots and stored at −70° C. Each aliquot contained millions of cells to ensure complexity of the Microgenex library.

7. DNA Isolation of Microgenex Library:

One frozen portion can be used to inoculate a one-liter culture of LB containing the appropriate antibiotic. Standard plasmid DNA isolation protocols/kits are used to isolate DNA suitable for transforming, for example, mammalian cells for ligand screening.

Master primer was used as a template for all constructed peptide expression libraries (Microgenex), whether for expression in mammalian cells or Escherichia coli. It could also be used for yeast or insect cell expression. Left and Right oligos were used as forward and reverse primers in order to generate, by PCR, double-stranded DNA fragments encoding the repertoire of the peptide library. The first ATG in this fragment encodes Met as the translation initiation site of the peptides. Gly, the second amino acid encoded by the DNA fragment, ensures that the expressed peptides are stable inside cells according to the N-end rule. The following three repeats of NNK (N for A, C, G, and T in equal molar ratio, K for G and T in equal molar ratio) code for all possible amino acids and thus form the peptide library. Two stop codons, TGA and TAA, are added right after the last NNK to stop peptide translation. The 5′-end KpnI site and 3′-end HindIII site allow the fragments to be cloned into the pCEP4 vector in the correct direction under the CMV promoter.
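The NNK scheme described above can be checked with a short enumeration. The following Python sketch is illustrative only and is not part of the described method; the function names and the compact encoding of the standard genetic code are assumptions introduced for this example. It lists the codons permitted by NNK, identifies which of them are stop codons, and counts the DNA sequences and full-length peptides available for a given number of random positions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def nnk_codons():
    """Return every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(c) for c in product("ACGT", "ACGT", "GT")]


def library_sizes(n_random):
    """Return (DNA sequences, full-length peptides) for n_random NNK positions."""
    codons = nnk_codons()
    residues = {CODON_TABLE[c] for c in codons} - {"*"}
    return len(codons) ** n_random, len(residues) ** n_random


if __name__ == "__main__":
    codons = nnk_codons()
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    print(len(codons), "NNK codons; stop codons among them:", stops)
    print("n = 3 (DNA sequences, peptides):", library_sizes(3))
```

Because K excludes A in the third position, the stop codons TAA and TGA cannot occur within the random region; only TAG remains, while all 20 amino acids are still encoded.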

The peptide-coding DNA fragment was inserted into the mammalian expression vector pCEP4 between its KpnI and HindIII sites under the CMV promoter. The plasmid has the replication origin, OriP, which enables the replication of the plasmid inside mammalian cells. The selection marker expression cassette in the plasmid provides the plasmid-transfected cells with hygromycin resistance.

After ligation of the KpnI/HindIII restricted PCR product and the KpnI/HindIII restricted pCEP4 or pBAD33, the DNA was electroporated into DH10B™ (Life Technologies, GibcoBRL) cells. Transformants were plated on LB plates with ampicillin (200 μg/ml) or chloramphenicol (40 μg/ml) depending on the cloning vector used in the construction of the Microgenex library. All colonies from all the plates were pooled together and washed with LB with ampicillin (200 μg/ml), only if the cloning vector contains the beta-lactamase gene, as in the case of pCEP4. The pooled cell mixture was divided into small aliquots and stored at −70° C. Each frozen portion of the Microgenex library can be used to inoculate and start a 1-liter culture (with the appropriate antibiotic) in order to prepare the plasmid DNA. Insertions were verified by PCR of the generated Microgenex with the Right and Left oligos. Also, the forward and reverse primers of pCEP4 were used for verification of the constructed mammalian Microgenex. Similar PCR reactions were performed using plasmid DNA prepared from a small number of randomly picked colonies in order to estimate the cloning efficiency and the ratio of peptide-encoding plasmids to empty vectors in any given Microgenex. Note that all generated peptides will start with the M and G amino acids followed by three random amino acids (X, X, X). It should be noted that the number of random amino acids (X) can be changed by changing the number of NNK codons.
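As a minimal sketch of how the number of NNK codons could be varied, the Python snippet below assembles a Master-style template for any number of random positions and checks that the Left and Right primers of Table 1 still anneal to its ends. The names `master_oligo`, `UPSTREAM` and `DOWNSTREAM` are hypothetical helpers introduced here; the flanking sequences are simply the Master primer of Table 1 split around its coding region.

```python
LEFT = "AAGAGCTCGGTACCAAGAAGGAG"       # Left primer (Table 1)
RIGHT = "CTGAATTCAAGCTTGGATCCTTATC"    # Right primer (Table 1)

# Flanks taken from the Master primer of Table 1 (assumed split points).
UPSTREAM = "AAGAGCTCGGTACCAAGAAGGAGTTTACAT"   # contains the KpnI site GGTACC
DOWNSTREAM = "GGATCCAAGCTTGAATTCAG"           # contains the HindIII site AAGCTT

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def master_oligo(n_random):
    """Build 5'-flank + ATG GGA (NNK)n TGA TAA + 3'-flank for n random codons."""
    if n_random < 1:
        raise ValueError("at least one NNK codon is required")
    coding = "ATG" + "GGA" + "NNK" * n_random + "TGA" + "TAA"
    return UPSTREAM + coding + DOWNSTREAM


if __name__ == "__main__":
    for n in (3, 6, 18):
        oligo = master_oligo(n)
        anneals = (oligo.startswith(LEFT)
                   and oligo.endswith(reverse_complement(RIGHT)))
        print(n, len(oligo), "primers anneal:", anneals)
```

With three random codons the function reproduces the 71-base Master primer of Table 1; each additional NNK adds three bases while leaving both primer-binding sites unchanged.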

Results

Colonies in the range of 25,000-75,000 on the plates were commonly obtained, ensuring complexity/diversity of the cloned Microgenex repertoire. They were collected into 10-12 portions.

A summary of the Microgenex libraries is given in Table 3.

TABLE 3 Plasmids used for the construction of Microgenex libraries.

Plasmid name | Promoter | Insertion | Selection marker | Replication origin | Reference
pCEP4 | CMV | No | E. coli: Amp; Mammalian: Hygromycin B | E. coli: ColE; Mammalian: OriP | Vector from Invitrogen
Mammalian Microgenex library | | Microgenex library with amino acid sequence as: M-G-X-X-X | | | This work, FIG. 2.3.
pBAD33 | pBAD | No | Chloramphenicol | p15A | Vector from Beckwith; Guzman, et al., 1995
E. coli Microgenex library | | Microgenex library with amino acid sequence as: M-G-X-X-X | | | This work

Plasmid prepared from one such frozen aliquot was examined by PCR. When mammalian Microgenex library DNA was used as a template in PCR reactions with the pCEP4 forward and reverse primers, the results showed insertions of the expected size. PCR results from 6 randomly picked colonies from the original mammalian Microgenex library plates showed that 4 of the colonies had the right construction (showed the correct size of insertion) while two did not.

A test for the presence of Microgene insertions in a constructed mammalian library by PCR from 6 randomly picked colonies indicated that 4 out of the 6 colonies have the right size of insertion. Also, PCR results from 6 randomly picked colonies from the original E. coli Microgenex library plates showed that 4 of the colonies had the right construction, in that they showed the correct size of insertion.

Example 2

Demonstrating the Generation and Use of a Microgene Library in Identifying a Protein Binding Ligand of Interest

A—Construction of p53 Reporter Working in Mammalian Cells

The dysfunction of mutant p53 in cancer cells can result not only from an inability to bind its specific consensus DNA sequence, but also from mislocalization inside cells or an inability to interact correctly with other transcription co-factors. The sub-cellular localization mechanism and transcription machinery in mammalian cells are different from those in prokaryotic organisms. The loss of specific DNA binding ability may involve post-translational modification processes that are totally different from those in prokaryotic organisms. Thus a p53 reporter for mammalian cells has advantages over one for prokaryotic cells.

P53 protein functions in mammalian cells mainly through its transcription factor activity. P21 is one of the p53-transactivated genes that are critical in cell cycle control. P53 can induce cell cycle arrest in G1 phase via transactivation of the p21 gene once the cells are under stresses such as DNA damage. Many p53 mutants found in cancer cells lose the ability to transactivate p21 transcription. This leads not only to the development of cancer, but also to resistance to cancer chemotherapy and radiotherapy that aim to damage DNA in order to kill cancer cells. Thus the transactivation level of the p21 promoter in a cell reflects whether the p53 protein expressed in that cell has wild type function or not.

We constructed our p53 reporter by cloning the human p21 promoter upstream of the puromycin resistance gene. It has been reported that the basal transcription level from the p21 promoter is very low. The reporter can provide human cells with puromycin resistance only in the presence of wild type or wild-type-functioning p53 protein that transactivates the p21 promoter, while leaving cells with dysfunctional mutated p53 protein sensitive to puromycin. The advantage of using the puromycin resistance gene is that it sets up a life-death selection: only the cells with wild-type-functioning p53 protein can survive puromycin selection.
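The life-death selection logic described above can be summarised as a small truth table. The Python sketch below is a deliberately simplified boolean model written for illustration; the function names and the status labels ("wild-type", "restored", "mutant", "null") are assumptions, and the model ignores puromycin dose and reporter expression level.

```python
# p53 states treated as able to transactivate the p21 promoter (assumed labels).
FUNCTIONAL_P53 = {"wild-type", "restored"}


def reporter_active(p53_status, has_reporter):
    """The p21 promoter drives the resistance gene only with functional p53."""
    return has_reporter and p53_status in FUNCTIONAL_P53


def survives(p53_status, has_reporter, puromycin_present):
    """A cell survives puromycin only if the reporter confers resistance."""
    if not puromycin_present:
        return True
    return reporter_active(p53_status, has_reporter)


if __name__ == "__main__":
    for status in ("wild-type", "restored", "mutant", "null"):
        print(status, "-> survives selection:", survives(status, True, True))
```

In this model a cell carrying mutant p53 survives selection only once its p53 function has been restored, which is the property the screening method relies on.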

Materials and Methods

Plasmids and vectors used in this Example are WWP-Luc (gift from Prof. Stanbridge E. J.), pPUR (Clontech), and pUC18 (Pharmacia). Enzymes used for cloning are EcoRI (Promega), PvuII (Promega), SalI (Promega), SphI (Promega), T4 DNA ligase (New England BioLab), Klenow (Promega), and SAP (Shrimp Alkaline Phosphatase). Cell lines for the validation of our reporter are U-2 OS cell (ATCC Number: HTB-96, Homo sapiens (human), osteosarcoma, bone, wild type p53), Saos-2 cell (ATCC Number: HTB-85, Homo sapiens (human), osteosarcoma, bone, p53 null), and SW480.7 (ATCC number: CCL 228, Homo sapiens (human), colorectal adenocarcinoma, colon, p53 mutant: R273H/P309S). Medium and other chemicals include McCoy's medium, DME medium, MEM, FCS, calcium phosphate transfection reagents (0.1×TE (pH8.0), 2×HBS (pH7.4), 2M CaCl₂, 15% glycerol in HBS), PBS, and puromycin (Sigma).

WWP-Luc plasmid was restricted with EcoRI and SalI. The 2366 bp fragment was gel-purified (1% agarose in 1×TBE) by the freeze-thaw method. Vector pUC18 was restricted with EcoRI and SalI, dephosphorylated with SAP, and purified by phenol-chloroform extraction. The p21 promoter fragment was then subcloned into the pUC18 vector with T4 DNA ligase, generating a plasmid named p21uc. All drug resistance markers in the constructed plasmids are listed in Table 1, together with the genes involved in the resistance.

TABLE 1 Drug resistance markers in constructed plasmids and their genes.

Drug | Resistance gene
Puromycin | Puromycin-N-acetyl-transferase (pac)
Neomycin | Neomycin phosphotransferase gene
Hygromycin B | Hygromycin B phosphotransferase
Histidinol | Histidinol dehydrogenase (hisD)

The p21 promoter was cut out from p21uc first with EcoRI, blunted with Klenow and dNTP, and then cut with SphI. The 2450 bp p21 promoter fragment was gel-purified using the same method as above. Vector pPUR was restricted fully with PvuII and SphI, which removed the SV40 early promoter upstream of the puromycin resistance gene, dephosphorylated with SAP, and gel-purified by the same method. Finally the p21 promoter was cloned into the pPUR vector upstream of the puromycin resistance gene with T4 DNA ligase, generating the p53 reporter named p21ur.

Verification of the construction was carried out by restriction of the generated plasmid with HindIII. The original vector pPUR has only one HindIII site and thus gives one band of 4257 bp after gel electrophoresis. This HindIII site was removed while constructing the reporter p21ur. Two new HindIII sites were introduced into p21ur with the p21 promoter, so that restriction gives two bands, one at 2.4 kb and the other at 3.9 kb.

Validation of the responsiveness of p21ur to p53 was carried out by transfection of p21ur into the osteosarcoma cell line U-2 OS that expresses wild type p53 protein, Saos-2 that is p53 null, and the colon cell line SW480.7 that expresses R273H and P309S double mutated p53 protein. One day before transfection, 3×10⁵ U-2 OS, 1.5×10⁵ Saos-2, and 3×10⁵ SW480.7 cells were seeded into each well of a 6-well plate. Transfection was carried out by the calcium phosphate method, with 25 μg of p21ur for each well. A 2-minute glycerol shock was applied 3 hours after DNA was added to the cells and the transfected cells were incubated in MEM/10% FCS at 37° C. with 5% CO₂ overnight. The transfected cells were transferred to new 6-well plates at a concentration of 5×10⁴ cells per well on the next day, and incubated in growth medium (McCoy's medium/10% FCS for U-2 OS and DME medium/10% FCS for Saos-2 and SW480.7) with puromycin. The same puromycin treatment was also given to non-transfected cells as a control. The medium with puromycin was refreshed every three days and the growth status of the cells was observed.

The cloned vector was restricted with HindIII, which proved that the p21 promoter, a ~2.4 kb fragment, is inserted upstream of the puromycin resistance gene. DNA electrophoresis showed that the restriction gave two bands with the correct theoretical sizes at 2.4 and 3.9 kb.

Transfection of the reporter p21ur into the U-2 OS cell line provided puromycin resistance to the transfected cells (Table 2). Cells transfected with p21ur remained alive after 5 days of puromycin treatment at 0.5 μg/ml, while the untransfected cells died out. The transfected cells could not survive higher concentrations of puromycin because of the low expression level of p53 protein. Neither transfected Saos-2 cells (null p53) nor SW480.7 cells (mutant p53) survived after 5 days of puromycin treatment at a concentration of 0.5 μg/ml or higher, which proved that the resistance of p21ur-transfected U-2 OS cells to puromycin requires the existence of wild type p53 that transactivates the p21 promoter. This validated the proper responsiveness of the reporter p21ur to p53 in human cells.

TABLE 2 Validation of the reporter p21ur: its response to wild type p53 protein. Number of cells attached to the bottom of the well after 5 days of puromycin treatment.

Puromycin (μg/ml) | Untransfected U-2 OS | Transfected U-2 OS
0 | Growing well | 12600
0.5 | 0 | 11500
1 | 0 | <5000
2 | 0 | <5000
3 | 0 | <5000

B—Construction of Peptide Library Expression in Mammalian Cells

We have constructed a microgene library to provide a series of molecular shapes for testing. This ‘microgenex’ approach enabled us to screen the expressed peptide repertoire and identify peptides that could correct mutated p53 and restore all of its downstream activity.

Selection of the functional peptides can be carried out either in vivo or in vitro according to the peptide library. The advantage of in vivo selection over in vitro selection is that it allows the peptides to carry out their function in an environment similar to or even exactly the same as that in which they will later be applied in therapy. This makes the peptides more applicable and simplifies the modification step for the selected peptides. In addition, it can often detect weak and transient interactions.

Expression of the peptide library allows in vivo screening for possible peptide ligands that can adjust mutant p53 protein back to its wild type function. Once the peptide-coding sequences were constructed into an expression vector, they were easily internalized by cells. With the help of positive-selection reporters, the possible ligands were also readily identified by the recovery of ligand-encoding plasmids from surviving cells. Moreover, according to this technique, there is no need to either synthesize or purify peptides, which can be expensive and labour intensive. An un-integrated, autonomously replicating mammalian expression vector was chosen to prevent the dilution of transfected ligand-encoding plasmids during cell division, and the integration of the plasmids into the host chromosomes, which causes difficulties in rescuing the ligand from surviving cells.

Materials and Methods

pCEP4 (Invitrogen):

TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG AGCTCGTTTA
GTGAACCGTC AGATCTCTAG AAGCTGGGTA CCAGCTGCTA GCAAGCTTGC TAGCGGCCGC
TCGAGGCCGG CAAGGCCGGA TCCAGACATG TTGATGAGTT TGGACAAACC ACAACTAGAA (SEQ. ID. NO.: 21)

Features annotated on this sequence, in 5′ to 3′ order: CAAT, TATA, pCEP Forward primer, KpnI, HindIII, EBV Reverse primer.

Oligo REGP-10: 5′-AAGAGCTCGG TACCAAGAAG GAGTTTACAT ATG GGA NNK NNK NNK TGA TAA GGATCCAAGCTTGAATTCAG-3′ (SEQ. ID. NO.: 22). The oligo contains a KpnI site near its 5′ end, the codons ATG GGA NNK NNK NNK encoding M G X X X, two stop codons (TGA TAA) and a HindIII site near its 3′ end.

Oligo REGP-11: 5′-AAGAGCTCGGTACCAAGAAGGAG-3′ (SEQ. ID. NO.: 23), containing the KpnI site.

Oligo REGP-12: 5′-CTGAATTCAAGCTTGGATCCTTATC-3′ (SEQ. ID. NO.: 24), containing the HindIII site.

pCEP Forward primer: 5′-AGAGCTCGTTTAGTGAACCG-3′ (SEQ. ID. NO.: 25)

EBV Reverse primer: 5′-GTGGTTTGTCCAAACTCATC-3′ (SEQ. ID. NO.: 26)

KpnI, HindIII, T4 DNA ligase, 10× T4 DNA ligase buffer.

Oligo REGP-10 was used as a template, while oligos REGP-11 and REGP-12 were used as forward and reverse primers, to generate double-stranded DNA fragments encoding a degenerate peptide library by PCR. The first ATG in this fragment encodes Met as the translation initiation site of the peptides. Gly, the second amino acid encoded by the DNA fragment, ensures that the expressed peptides are stable inside cells according to the N-end rule. The following three repeats of NNK (N for A, C, G, and T in equal molar ratio, K for G and T in equal molar ratio) code for all possible amino acids and thus form the peptide library. Two stop codons, TGA and TAA, are added right after the last NNK to stop peptide translation. The 5′-end KpnI site and 3′-end HindIII site allow the fragments to be cloned into the pCEP4 vector in the correct direction under the CMV promoter.

After ligation of the KpnI/HindIII restricted PCR product and KpnI/HindIII restricted pCEP4, the DNA molecules were electroporated into XL1-blue cells and plated on LB plates with ampicillin. All colonies from the plates were pooled together, divided into small portions and stored at −70° C. Each portion of the transformed cells can be used to inoculate and start a 1-liter culture in order to prepare the plasmids with the Qiagen Maxi-Prep kit. Insertions were verified by PCR of the generated plasmids with oligos REGP-11 and REGP-12. The same PCR reactions were also applied to plasmids prepared from 6 randomly picked colonies so as to estimate the cloning efficiency and the ratio of peptide-encoding plasmids to empty vectors in the population. Note that all generated peptides will start with the M and G amino acids followed by three random amino acids.

Results

There were ~25,000 colonies on the plates. They were collected into 12 portions. Plasmid prepared from one such portion was examined by PCR.

Gel electrophoresis showed that there were insertions of the correct size. PCR results from the 6 randomly picked colonies from the original library plates showed that 4 of the colonies had the right construction (showed the correct size of insertion) while two did not. The structure of the generated peptide library plasmid is shown in FIG. 7.
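The colony count and the colony PCR result reported above allow a rough estimate of how much of the theoretical NNK sequence space such a library might sample. The Python sketch below is an illustrative back-of-envelope calculation only, not a result from this Example. It assumes that every NNK DNA sequence is equally likely, that clones are independent, and that the 4-of-6 colony PCR ratio is representative of the whole library; the function name `expected_coverage` is introduced here for illustration.

```python
import math


def expected_coverage(n_clones, n_variants):
    """Expected fraction of variants seen at least once (Poisson approximation).

    Assumes every variant is equally likely and clones are independent.
    """
    return 1.0 - math.exp(-n_clones / n_variants)


if __name__ == "__main__":
    colonies = 25_000            # approximate colony count reported above
    insert_fraction = 4 / 6      # colonies with correct insert size by PCR
    dna_variants = 32 ** 3       # (NNK)3: 32 codons at each of 3 positions

    clones_with_insert = colonies * insert_fraction
    coverage = expected_coverage(clones_with_insert, dna_variants)
    print(f"insert-bearing clones (estimated): {clones_with_insert:.0f}")
    print(f"expected fraction of NNK DNA sequences sampled: {coverage:.2f}")
```

Coverage at the peptide level would be higher than at the DNA level, because most amino acids are encoded by more than one NNK codon; the sketch does not attempt that refinement.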

C—Screening for Ligands that can Correct p53 Mutants in SW480.7 Cells

SW480.7 is a human colon cancer cell line that carries R273H and P309S double mutations in its p53 gene. R273H is one of the most frequently occurring p53 mutations observed in various cancer tissues. It has been reported that the p53 mutant in SW480.7 loses its function in transactivating its downstream genes such as p21, Bax, 14-3-3σ, and CD95. Our purpose in screening the peptide library constructed above is to find one or more peptide ligands that can restore wild-type p53 function without altering the mutant p53 gene in the cell or disturbing cells with wild type p53. The choice of a human cancer cell line provides the real human environment in which the selected peptides can be modified, degraded, and transported to subcellular organelles, and thus improves the reliability with which the selected peptides can be expected to function in the human body.

Materials and Methods

Human colon cancer cell line SW480.7 with p53 mutations R273H and P309S (ATCC Number: CCL-228, Homo sapiens (human), adenocarcinoma, colorectal; colon), DMEM medium, FCS, PBS, trypsin, puromycin (Sigma), hygromycin (Sigma), peptide library (constructed by our laboratory as described above), pCEP4 vector (Invitrogen), calcium phosphate transfection reagents (0.1×TE (pH8.0), 2×HBS (pH7.4), 2M CaCl₂, 15% glycerol in HBS).

DNA extraction reagents (TBS buffer (0.8% NaCl, 0.02% KCl, 0.3% Tris, pH7.4), TE buffer (pH8.0), DNA extraction buffer (10 mM TrisCl, 100 mM EDTA, 20 μg/ml RNase, 0.5% SDS, pH8.0), Proteinase K (18 mg/ml), chloroform, phenol (Tris saturated), ethanol, 7.5M ammonium acetate (pH 7.4)), XL1-blue electroporation competent cells, pCEP Forward primer, EBV Reverse primer, dye-terminator thermocyclic DNA sequencing kit, ALF express DNA automatic sequencing system.

The peptide library was transfected into SW480.7 using the calcium phosphate method described in the reporter construction part. pCEP4 was also transfected into SW480.7 cells as a negative control. On the second day after transfection, puromycin was added to the medium at a concentration of 0.5 μg/ml. The transfected cells were then washed with PBS and given fresh medium with 0.5 μg/ml puromycin every three days. After all the negative control cells had died, all the library-transfected cells that were still attached to the wells were collected by scraping into ice-cold TBS buffer.

Rescue of library plasmids was carried out by extraction of total DNA from the collected puromycin resistant cells according to Blin and Stafford's method. The extracted DNA was then directly electroporated into competent XL1-blue cells, which were plated on LB plates with ampicillin. All the colonies were picked for plasmid preparation. The plasmids were first screened by size using agarose gel electrophoresis to get rid of the reporters. The rest of the plasmids were sequenced with the ALF express DNA automatic sequencing system (Pharmacia) using the EBV Reverse sequencing primer (Invitrogen) to reveal the DNA sequence, and hence the amino acid sequence, of the selected peptide ligands.
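Going from a rescued DNA sequence to the encoded peptide amounts to locating the first ATG and translating up to the first stop codon. The Python sketch below illustrates this step under stated assumptions: the function name `translate_first_orf` and the compact genetic-code table are introduced here for illustration, and the input is assumed to be the coding strand (a read obtained with a reverse primer would first have to be reverse-complemented). The example input is the insert sequence reported in the Results of this Example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_first_orf(dna):
    """Translate from the first ATG to the first in-frame stop codon."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":          # stop codon ends the peptide
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    insert = "GGTACCAAGAAGGAGTTTACATATGGGATGGTGTACTTGATAAGGATCCAAGCTT"
    print(translate_first_orf(insert))
```

For the insert shown, the first ATG is followed by GGA TGG TGT ACT and then the TGA stop codon, so the sketch returns MGWCT.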

Results

No pCEP4-transfected cells survived in the well after 7 days of selection with puromycin, while there were still some living library-transfected cells attached to the bottom of the well.

Electroporation of the DNA extracted from the library-transfected cells after 7 days of selection gave ~150 colonies. Among them, 50 of the colonies had the right size of library plasmid and were sequenced. One ligand peptide sequence was found. The DNA and amino acid sequences are:

DNA: 5′-GGTACCAAGAAGGAGTTTACAT ATG GGA TGG TGT ACT TGA TAA GGATCCAAGCTT-3′ (SEQ. ID. NO.: 27), with the KpnI site at the 5′ end and the HindIII site at the 3′ end.

Amino acid: M G W C T (SEQ. ID. NO.: 28), followed by two stop codons.

D—Validation of the Selected Peptide Ligand

The purpose of this experiment is to confirm that the selected peptide ligand indeed restores the mutated p53 protein to wild-type p53 function in SW480.7 cells. The selection of peptide ligands was based on puromycin resistance. There is a chance that the cells acquired the resistance, with or even without the help of the selected ligand, via mechanisms other than the p53-transactivated p21 promoter. Thus we should make sure that the mutant p53 in SW480.7 cells functions as wild-type p53 only if the selected ligand is added. Confirmation/validation was achieved in two different ways:

-   1—We observed the effect of the peptide ligand on cell cycle control of SW480.7 cells, as well as that of U-2 OS (WT p53) and Saos2 cells (no p53 copy). The selected peptide ligand is further verified only if it induces cell cycle arrest or apoptosis in SW480.7 cells but in neither U-2 OS nor Saos2 cells, which indicates the specificity of the selected ligand for the p53 mutation in SW480.7.
-   2—We also tested the transcription levels of genes that are only transactivated by wild-type p53. If the transcription levels of these genes are highly induced in ligand-treated cells compared with the negative control (cells transfected only with the empty vector, pCEP4), the selected ligand/peptide is shown to be capable of altering the mutant p53 function in SW480.7 into wild-type function.

P53 (wild-type) regulates different cell responses by transactivating different downstream genes. For example, p53 causes cell cycle arrest at G1 phase by transactivation of the p21 gene, and cell cycle arrest at G2 phase by transactivation of the 14-3-3σ gene. Apoptosis is due to p53 transactivation of the CD95 and/or Bax gene. Thus we measured p21, CD95, and 14-3-3σ gene transcription levels by RT-PCR, to reveal the capability of the selected peptide ligand to restore wild-type p53 transactivation of the downstream genes. The expression level of β-actin was measured as a control. The β-actin transcription level was used to normalize the total mRNA amount between samples for comparison, owing to its constitutive expression in all cells.

Materials and Methods

SW480.7, U-2 OS, Saos 2, DMEM, McCoy's medium, FCS, PBS, trypsin, trypan blue (Sigma), calcium phosphate transfection reagents, peptide ligand (selected above), pCEP4 (Invitrogen), TRIZOL® Reagent, chloroform, isopropyl alcohol, 75% ethanol (in DEPC-treated water), 0.01% (v/v) DEPC water, SuperScript II reverse transcriptase (Life Technology), TE buffer (10 mM Tris (pH7.6), 1 mM EDTA), ethanol, 4M ammonium acetate (pH7.0), 1-kb DNA ladder, 10×TBE electrophoresis buffer, 1% agarose in 1×TBE buffer with ethidium bromide.

Primers for RT-PCR are listed in Table 3.

TABLE 3 RT-PCR primer sequences.
Name           Sequence                                        SEQ. ID. NO.
RT-WAF1-FOR    5′-CTACCTCAGGCAGCTCAAGC-3′                      29
RT-HME1-FOR    5′-AGACAGCACCCTCATCATGC-3′                      30
RT-CD95-FOR    5′-TGGTGCTCATCTTAATGGCC-3′                      31
RT-ACTIN-FOR   5′-TGACAAAACCTAACTTGCGC-3′                      32
CDS Primer     5′-AAGCAGTGGTAACAACGCAGAGTACT₍₃₀₎N⁻¹N-3′        33
PCR Primer     5′-AAGCAGTGGTAACAACGCAGAGT-3′                   34

Target genes and their corresponding RT-PCR products using the above primers are listed in Table 4.

TABLE 4 RT-PCR primer target genes and their corresponding products.
Primer name   Target gene   Function       Amplified region in target mRNA   Size of RT-PCR product
RTWAF1FOR     p21           G1 arrest      1514-2121                         607 bp
RTHME1FOR     14-3-3σ       G2 arrest      652-1245                          593 bp
RTCD95FOR     CD95          Apoptosis      1968-2534                         566 bp
RTACTINFOR    β-actin       Cytoskeleton   1236-1793                         557 bp
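As a cross-check on Table 4, the listed product sizes can be recomputed from the amplified regions. The sketch below assumes that each size is the simple difference between the end and start coordinates, which is what the tabulated values imply; the dictionary and function names are illustrative only.

```python
# Amplified regions and product sizes copied from Table 4.
# Tuple layout: (start coordinate, end coordinate, listed product size in bp).
AMPLICONS = {
    "p21": (1514, 2121, 607),
    "14-3-3σ": (652, 1245, 593),
    "CD95": (1968, 2534, 566),
    "β-actin": (1236, 1793, 557),
}


def product_size(start, end):
    """Product size as the coordinate difference (assumed convention)."""
    return end - start


def check_table(amplicons):
    """Return, per gene, whether the listed size matches the coordinates."""
    return {
        gene: product_size(start, end) == listed
        for gene, (start, end, listed) in amplicons.items()
    }


for gene, ok in check_table(AMPLICONS).items():
    print(gene, "consistent" if ok else "MISMATCH")
```

All four products fall between 557 bp and 607 bp, so they run at similar positions on the gel and are best compared in separate lanes.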

SW480.7, U-2 OS, and Saos-2 cells were transfected in parallel with the peptide ligand or with pCEP4 using the calcium phosphate method. The transfection plan is shown in Table 5.

TABLE 5 Transfection plan for the verification of peptide ligand function.
Cell type                 SW480.7           U-2 OS            Saos-2
DNA                       Ligand   pCEP4    Ligand   pCEP4    Ligand   pCEP4
Number of 6-well plates   3        3        1        1        1        1

After transfection, cells were cultured without drug selection and diluted at a ratio of 1:100 every five days. Cell proliferation was closely observed until the ligand-transfected SW480.7 cells died out. Vital staining with 0.4% trypan blue was used to determine cell death: dead cells cannot exclude the dye and are stained blue, whereas living cells exclude the dye and remain unstained. Total RNA was collected from two wells of one ligand-transfected SW480.7 plate and one pCEP4-transfected SW480.7 plate on day 0, day 1, and day 2 after transfection. The RNA samples were thus harvested as SW480.7/Ligand-0 day, SW480.7/Ligand-1 day, SW480.7/Ligand-2 day, SW480.7/pCEP4-0 day, SW480.7/pCEP4-1 day, and SW480.7/pCEP4-2 day. Total RNA was extracted with TRIZOL® Reagent according to the protocol given by Life Technologies:

1. Homogenization

Wash the cells with DEPC water once, then add 1 ml of TRIZOL Reagent per well. Pass the cell lysate several times through a pipette.

2. Phase Separation

Incubate the homogenized samples for 5 minutes at room temperature to permit the complete dissociation of nucleoprotein complexes. Add 200 μl of chloroform per 1 ml of TRIZOL Reagent. Cap the sample tubes securely. Shake the tubes vigorously by hand for 15 seconds and incubate them at room temperature for 2 to 3 minutes. Centrifuge the samples at no more than 12,000×g for 15 minutes at 2 to 8 °C.

3. RNA Precipitation

Transfer the aqueous phase to a fresh tube. Precipitate the RNA by adding 500 μl of isopropyl alcohol per 1 ml of TRIZOL Reagent used for the initial homogenization. Incubate the samples at room temperature for 10 minutes and centrifuge at no more than 12,000×g for 10 minutes at 2 to 8 °C.

4. RNA Wash

Remove the supernatant. Wash the RNA pellet once with 75% ethanol, adding at least 1 ml of 75% ethanol per 1 ml of TRIZOL Reagent used for the initial homogenization. Mix the sample by vortexing and centrifuge at no more than 7,500×g for 5 minutes at 2 to 8 °C.

5. Re-Dissolving the RNA

Briefly dry the RNA pellet. Dissolve the RNA in DEPC water by passing the solution several times through a pipette tip and incubating for 10 minutes at 55 to 60 °C. Determine the concentration by measuring OD₂₆₀. Store at −70 °C.

RT-PCR:

First-Strand cDNA Synthesis

1. For each RNA sample, combine the following reagents in a sterile 0.2-ml reaction tube:
1-3 μl   RNA sample (using the same amount of RNA among samples)
1 μl     CDS primer (10 μM)
1 μl     RT forward primer (10 μM)
x μl     Deionized H₂O
5 μl     Total volume

2. Mix and spin the tube briefly.

3. Incubate the tube at 70 °C in a thermal cycler for 2 min.

4. Spin the tube briefly to collect contents at the bottom. Keep the tube at room temperature.

5. Add the following to each reaction tube:
2 μl   5x First-Strand Buffer
1 μl   DTT (20 mM)
1 μl   50x dNTP (10 mM)
1 μl   SuperScript II reverse transcriptase (200 units/μl)

6. Gently vortex and spin the tubes briefly.

7. Incubate the tubes at 42 °C for 1 hr in an air incubator or cycler.

8. Add 40 μl TE buffer (10 mM Tris (pH 7.6), 1 mM EDTA) to dilute the first-strand reaction product.

9. Heat the tubes at 72 °C for 7 min to inactivate the reverse transcriptase.

10. Samples can be stored at −20 °C for up to three months.
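The volume bookkeeping of the first-strand protocol can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, and the sketch assumes that the water volume x simply fills the primer and RNA mix to 5 μl.

```python
# Volumes (microlitres) added in step 5 of the first-strand protocol.
STEP5_ADDITIONS_UL = {
    "5x First-Strand Buffer": 2.0,
    "DTT (20 mM)": 1.0,
    "50x dNTP (10 mM)": 1.0,
    "reverse transcriptase": 1.0,
}


def water_volume_ul(rna_ul, total_ul=5.0, cds_primer_ul=1.0, rt_primer_ul=1.0):
    """Water needed in step 1 so that RNA plus both primers reach the 5 ul total."""
    if not 1.0 <= rna_ul <= 3.0:
        raise ValueError("the protocol uses 1-3 ul of RNA sample")
    return total_ul - rna_ul - cds_primer_ul - rt_primer_ul


def reaction_volume_ul(step1_total_ul=5.0):
    """Volume of the reverse transcription reaction after step 5."""
    return step1_total_ul + sum(STEP5_ADDITIONS_UL.values())


def diluted_volume_ul(te_buffer_ul=40.0):
    """Volume after dilution with TE buffer in step 8."""
    return reaction_volume_ul() + te_buffer_ul


print(water_volume_ul(2.0))   # water for a 2 ul RNA sample
print(reaction_volume_ul())   # reaction volume before dilution
print(diluted_volume_ul())    # volume of diluted first-strand product
```

With these volumes, the 2 μl of 5x buffer is diluted to 1x in the 10 μl reaction, and the 40 μl of TE buffer gives a fivefold dilution of the first-strand product.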

cDNA Amplification by PCR

1. Preheat the PCR thermal cycler to 95 C.

2. For each reaction, 5 μl of diluted first-strand cDNA and 5 μl of deionized water are added to a labeled 0.2-ml reaction tube.

3. Prepare a master mix for all reaction tubes, plus one additional tube. For each reaction:
20.5 μl   Deionized water
5 μl      10x PCR buffer
1 μl      50x dNTP (10 mM)
1.5 μl    PCR primer (10 μM)
1.5 μl    RT forward primer (10 μM)
10 μl     5x Q solution
0.5 μl    HotStart Taq DNA polymerase
40 μl     Total volume

4. Mix by vortexing and spin the tube briefly.

5. Aliquot 40 μl of the PCR Master Mix into each tube from Step 2.

6. Cap the tube, and place it in the preheated thermal cycler.

7. PCR was carried out using the following cycling parameters:
            95 °C 15 min
2 cycles    94 °C 30 sec   70 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   69 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   67 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   65 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   63 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   61 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   59 °C 30 sec   72 °C 1 min
2 cycles    94 °C 30 sec   57 °C 30 sec   72 °C 1 min
40 cycles   94 °C 30 sec   55 °C 30 sec   72 °C 1 min
1 cycle     72 °C 7 min
            4 °C store

8. Electrophorese 5 μl of each PCR reaction alongside 0.1 μg of 1-kb DNA ladder on a 1.2% agarose/EtBr gel in 1×TBE buffer.
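The master mix of step 3 and the touchdown cycling parameters of step 7 can be written out programmatically, for example to scale reagent volumes or to count cycles. The following is a minimal sketch under stated assumptions: the data structures and function names are illustrative, and the initial 95 °C incubation is assumed to run once.

```python
# Per-reaction master mix volumes (microlitres), copied from step 3.
MASTER_MIX_UL = {
    "deionized water": 20.5,
    "10x PCR buffer": 5.0,
    "50x dNTP (10 mM)": 1.0,
    "PCR primer (10 uM)": 1.5,
    "RT forward primer (10 uM)": 1.5,
    "5x Q solution": 10.0,
    "HotStart Taq DNA polymerase": 0.5,
}

# Annealing temperatures (deg C) of the two-cycle touchdown blocks in step 7.
TOUCHDOWN_ANNEAL_C = [70, 69, 67, 65, 63, 61, 59, 57]


def scale_master_mix(n_reactions, extra=1):
    """Scale the mix for n reactions plus the additional tube called for in step 3."""
    factor = n_reactions + extra
    return {reagent: volume * factor for reagent, volume in MASTER_MIX_UL.items()}


def cycling_program():
    """Return the program as (repeat count, [(temperature C, seconds), ...]) blocks."""
    program = [(1, [(95, 15 * 60)])]              # initial incubation (assumed once)
    for anneal in TOUCHDOWN_ANNEAL_C:             # touchdown phase, 2 cycles each
        program.append((2, [(94, 30), (anneal, 30), (72, 60)]))
    program.append((40, [(94, 30), (55, 30), (72, 60)]))  # main amplification
    program.append((1, [(72, 7 * 60)]))           # final extension
    return program


def amplification_cycles(program):
    """Count cycles in the three-step (denature, anneal, extend) blocks."""
    return sum(count for count, steps in program if len(steps) == 3)


print(scale_master_mix(6))                      # six samples plus one spare tube
print(amplification_cycles(cycling_program()))  # touchdown plus main cycles
```

Each 40 μl aliquot of master mix is combined with the 10 μl of template and water from step 2, giving a 50 μl reaction in which the 10x PCR buffer and 5x Q solution are both at 1x.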

Results

Fifteen days after transfection with the ligand, without puromycin selection, the SW480.7 cells could no longer attach to the vessel wall and died. By contrast, the ligand-transfected U-2 OS and Saos-2 cells grew well.

RT-PCR analysis of the mRNA from ligand-transfected SW480.7 cells showed that this ligand can re-establish wild-type p53 function. Higher mRNA levels of the p21 (WAF1/Cip1), 14-3-3σ and CD95 genes were found than in SW480.7 cells transfected with the empty pCEP4 vector. This showed that the apoptosis and cell growth arrest of the ligand-transfected cells were due to the ligand, which corrected the R273H/P309S p53 mutant to wild-type function.

Example 3 New Reporter System Utilising the Bax Promoter

The previous p53 reporter, p21ur, is based on transactivation of the p21 promoter by wild-type p53. It has been shown that up-regulation of p21 protein expression results in cell growth arrest, but not apoptosis. Another p53 target, Bax (a cell death gene), can lead to apoptosis once its transcription and expression are upregulated. Bax is one of the key mediators through which p53 induces apoptosis, that is, kills cells that have failed to repair their damaged genomes.

The Bax promoter differs from the p21 promoter, although both contain the p53 consensus binding sequence. It has been reported that some p53 mutants can transactivate the p21 promoter but not the Bax promoter. This indicates a more stringent requirement on p53 conformation for transactivation of the Bax promoter.

Green fluorescent protein (GFP) is also used in the new reporter system. The Bax promoter drives a long transcript containing both the puromycin resistance gene and the GFP gene. Inserting an IRES fragment between the two genes introduces a second ribosome-binding site, so that both can be translated into protein simultaneously from one transcript (Clontech pIRESneo manual). The green color from GFP thus visualizes Bax promoter activity, and hence the ligand's activity in correcting p53 mutants, in addition to the positive selection by puromycin resistance. The green fluorescence eliminates the background caused by mechanisms, other than transactivation of the Bax promoter, that cells have developed to survive puromycin selection.

A selection marker is important for integration of the plasmid into the genome. A stable p53-null cell line with an integrated reporter system will be convenient for ligand screening, especially for non-peptide ligands.

Materials and Methods

Plasmids used in this Example are: p21ur (above), pIRESneo (Clontech), pEGFP-C1 (Clontech), and pREP8 (Invitrogen).

Genomic DNA (human, male, normal)

Restriction endonucleases (Promega): AgeI, EcoRV, HindIII, NruI, PvuI, SacI, SalI, SpeI, XbaI, XhoI, XmaI.

Modification enzymes: Vent DNA polymerase (NEB), T4 DNA ligase (Promega), and shrimp alkaline phosphatase (SAP).

Bax promoter primers:
Bax promoter Forward (HindIII site): 5′-atctaagcttgaggcttcagcccgggaattccag-3′ (SEQ. ID. NO.: 35)
Bax promoter Reverse (AgeI site): 5′-atctaccggtgccagcagtggcgccgtccaacag-3′ (SEQ. ID. NO.: 36)

EGFP primers:
EGFP Forward (XmaI site): 5′-aataacccgGGTCGCCACCATGGTGAGCAAG-3′ (SEQ. ID. NO.: 37)
EGFP Reverse (XbaI site, stop): 5′-aataatctagaACTTGTACAGCTCGTCCATGCCG-3′ (SEQ. ID. NO.: 38)

Construction of pBaxur (Bax promoter-controlled puromycin resistance gene):

Amplify the Bax promoter from normal human genomic DNA by PCR using the Bax promoter primers. The PCR product is 500 bp. It has a SmaI site near its 5′-end and a SacI site near its 3′-end.

Cut the Bax promoter PCR product with HindIII and AgeI.

Cut p21ur with HindIII and AgeI, and dephosphorylate afterwards. Purify the 3.9 kb fragment from gel.

Ligate the Bax promoter into the 3.9 kb fragment so that the Bax promoter controls expression of the puromycin resistance gene in mammalian cells. The final product is denoted pBaxur.

Construction of pBaxurEGFP and p21urEGFP (green fluorescent protein downstream of the puromycin resistance gene, visualizing the transcription level of the Bax or p21 promoter):

Amplify the EGFP gene fragment from the pEGFP-C1 vector by PCR with the EGFP primers.

The PCR product is around 750 bp. Cut the product with XmaI and XbaI, then purify it from gel.

Cut pIRESneo with XmaI and XbaI, then purify the 4.5 kb fragment from gel.

Ligate the PCR product with the pIRESneo 4.5 kb fragment to generate the vector pIRES-EGFP (5.2 kb).

Cut p21ur or pBaxur with EcoRV and XbaI, and purify the 3.2 kb or 1.3 kb fragment from gel.

Cut pIRES-EGFP with NruI and SpeI, dephosphorylate, and purify the 4.4 kb fragment from gel.

Ligate the two fragments to generate the transient p53 double reporter p21urEGFP (7.6 kb) or pBaxurEGFP (5.7 kb).

Construction of pBaxurEGFP-hisD (integrative plasmid with histidinol resistance, hisD, as the selection marker):

Cut pREP8 with PvuI, SalI, and SacI, and purify the 5194 bp fragment from gel.

Cut pBaxurEGFP with PvuI and XhoI, dephosphorylate, and purify the 3972 bp fragment from gel.

Ligate the hisD fragment into the pBaxurEGFP fragment to generate the integrative p53 double reporter pBaxurEGFP-hisD.

The structure of p21urEGFP was verified by digestion with HindIII, which should give five bands of 3489 bp, 2339 bp, 1041 bp, 360 bp, and 348 bp. Agarose gel electrophoresis of the digested plasmid showed the expected bands. The structure of pBaxur was verified by digestion with either HindIII together with SacI, which should give two bands of 3996 bp and 388 bp, or SmaI alone, which should give three bands of 3184 bp, 620 bp, and 580 bp. Agarose gel electrophoresis of the digested plasmid showed the expected bands.

The structure of pBaxurEGFP was verified by digestion with either EcoRI together with XbaI, which should give two bands of 2907 bp and 2766 bp, or SacI alone, which should give two bands of 3125 bp and 2548 bp. Agarose gel electrophoresis of the digested plasmid showed the expected bands.

The structure of pBaxurEGFP-hisD was verified by digestion with either EcoRI, which should give two bands of 5955 bp and 3211 bp, or SacI, which should give two bands of 6618 bp and 2548 bp. Agarose gel electrophoresis of the digested plasmid showed the expected bands.
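For a complete digest of a circular plasmid, the fragment sizes add up to the plasmid length, so the expected band sizes given above can be checked against one another. The sketch below performs that check; the dictionary layout and function names are illustrative, and the band sizes are copied from the text.

```python
# Expected restriction fragments (bp) for each construct, as stated above.
DIGESTS = {
    "p21urEGFP": {"HindIII": [3489, 2339, 1041, 360, 348]},
    "pBaxur": {"HindIII + SacI": [3996, 388], "SmaI": [3184, 620, 580]},
    "pBaxurEGFP": {"EcoRI + XbaI": [2907, 2766], "SacI": [3125, 2548]},
    "pBaxurEGFP-hisD": {"EcoRI": [5955, 3211], "SacI": [6618, 2548]},
}


def plasmid_sizes(digests):
    """Sum the fragments of every digest; return the set of totals per plasmid."""
    return {
        plasmid: {sum(bands) for bands in by_enzyme.values()}
        for plasmid, by_enzyme in digests.items()
    }


def digests_agree(digests):
    """True if all digests of each plasmid imply the same plasmid length."""
    return all(len(totals) == 1 for totals in plasmid_sizes(digests).values())


for plasmid, totals in plasmid_sizes(DIGESTS).items():
    print(plasmid, sorted(totals))
print("consistent:", digests_agree(DIGESTS))
```

The totals also agree with the construction steps: the two fragments ligated to make pBaxurEGFP-hisD (5194 bp and 3972 bp) sum to the same length as both of its digests.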

Screening for new peptide ligands is done by cotransfecting the new reporter p21urEGFP-hisD outlined above, together with the peptide library constructed above, into cancer cell lines such as SW480.7. The caspase inhibitor VAD-fmk (ApoAlert®, Clontech) is added to the cell culture at a final concentration of 40 mM in order to prevent apoptosis in the case of overexpression of the BAX gene. In addition, 0.5 μg/ml puromycin is added to the medium one day after transfection. After several days, surviving green cells (excited at 488 nm wavelength) will be the cells containing the candidate ligands. Plasmids rescued from these cells are subjected to DNA sequencing to deduce the amino acid sequences of the encoded peptide ligands.

When screening non-peptide ligands, a stable cell line carrying the reporter p21urEGFP-hisD is constructed. Members of the chemical compound library are added to the cell culture. The cells should also be cultured in medium containing 40 mM VAD-fmk and 0.5 μg/ml puromycin. Surviving cells that fluoresce green on excitation at 488 nm indicate successful candidate ligands.

1. A method of screening a library of peptides of formula M-G/M/V—(X)_(n) wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid, which method comprises: a) transforming a host cell population with a library of nucleic acid constructs that expresses free peptides of formula M-G/M/V—(X)_(n); b) culturing the transformed host cells under conditions suitable for intracellular expression of the peptides of formula M-G/M/V—(X)_(n); and c) analyzing the host cell population to determine the effect of the peptides of formula M-G/M/V—(X)_(n) on a reporter system.
2-27. (canceled)
28. The method of claim 1 wherein n is an integer from 3 to 5.
29. The method of claim 1 wherein the reporter system includes a target protein.
30. The method of claim 29 wherein the target protein is a nucleic acid binding protein.
31. The method of claim 30 wherein the nucleic acid binding protein is p53.
32. The method of claim 1 wherein the reporter system comprises a reporter gene.
33. The method of claim 32 wherein the reporter gene is operably linked to a sequence of nucleotides that provides a binding site for a target protein or for a protein that associates with or is a substrate for a target protein.
34. The method of claim 33 wherein the reporter gene is operably linked to a p21 or Bax promoter.
35. The method of claim 32 wherein the protein product of the reporter gene includes a secretion signal peptide.
36. The method of claim 32 wherein the protein product of the reporter gene includes a transmembrane domain.
37. The method of claim 32 wherein the host cells have been transfected with the reporter gene.
38. The method of claim 1 wherein the peptide library has at least 500 different members.
39. The method of claim 1 wherein the host cells are eukaryotic cells.
40. A method of identifying a peptide ligand of formula M-G/M/V—(X)_(n) wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine, and each X, which may be the same or different, is any genetically encoded amino acid, having a desired activity on a target protein, which method comprises: a) transforming a host cell population with a library of nucleic acid constructs which expresses peptides of formula M-G/M/V—(X)_(n); b) culturing the transformed host cells under conditions suitable for intra-cellular expression of the peptides of formula M-G/M/V—(X)_(n); c) analyzing the host cell population to determine the effect of the expressed peptides on the target protein, wherein the target protein forms part of a reporter system; and d) identifying the peptide of formula M-G/M/V—(X)_(n) in those cells in which the reporter system indicates a positive response.
41. The method of claim 40 wherein n is an integer from 3 to 5.
42. The method of claim 40 wherein the reporter system comprises a reporter gene.
43. The method of claim 42 wherein the reporter gene is operably linked to a sequence of nucleotides which provides a binding site for a target protein or for a protein which associates with or is a substrate for a target protein.
44. The method of claim 43 wherein the reporter gene is operably linked to a p21 or Bax promoter.
45. The method of claim 42 wherein the protein product of the reporter gene includes a secretion signal peptide.
46. The method of claim 42 wherein the protein product of the reporter gene includes a transmembrane domain.
47. The method of claim 42 wherein the host cells have been transfected with reporter gene.
48. The method of claim 40 wherein the peptide library has at least 500 different members.
49. The method of claim 40 wherein the host cells are eukaryotic cells.
50. A library of nucleic acid constructs which expresses free peptides in an intra-cellular environment, the peptides having the sequence M-G/M/V—(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid.
51. The library of claim 50 wherein each nucleic acid construct contains a promoter region which is operably linked to the sequence which encodes the peptide of formula M-G/M/V—(X)_(n).
52. The library of claim 51 wherein the nucleic acid constructs are in the form of expression vectors.
53. The library of claim 52 wherein the expression vectors are capable of autonomous replication.
54. The library of claim 50 wherein n is an integer from 3 to 10.
55. The library of claim 50 which has at least 500 different members.
56. The library of claim 55 which has at least 2000 different members.
57. The library of claim 50 wherein the value of n is the same for each member of the library.
58. A library of peptides, each member of the library having the amino acid sequence M-G/M/V—(X)_(n), wherein n is an integer from 3 to 18, M is methionine, G is glycine, V is valine and each X, which may be the same or different, is any genetically encoded amino acid.
59. The library of claim 58 wherein n is an integer from 3 to 10.
60. The library of claim 59 wherein n is an integer from 3 to 5.
61. The library of claim 58 which has at least 500 different members.
62. The library of claim 61 which has at least 2000 different members.
63. The library of claim 58 wherein the value of n is the same for each member of the library.
64. A method of generating a library of nucleic acid constructs which expresses free peptides in an intracellular environment, which method comprises synthesizing a library of DNA molecules which include the nucleotide sequence ATGGGA (NNK)_(n), wherein n is an integer from 3 to 18, N is A, C, T or G and K is G or T and wherein each NNK triplet may be the same or different, and inserting a library of synthesized DNA fragments which each includes a nucleotide sequence of formula ATGGGA (NNK)_(n) into expression vectors.
65. The method of claim 64 wherein during synthesis of the library of DNA molecules equimolar amounts of nucleotides A, C, T and G are added for incorporation at each N position and equimolar amounts of nucleotides G and T are added for incorporation at each K position.
66. The library of nucleic acid constructs prepared according to a method of claim 64.
67. The library of nucleic acid constructs prepared according to a method of claim 65.